A MARKOV DECISION MODEL FOR COMPUTER-AIDED INSTRUCTION


I.  INTRODUCTION 

Several  researchers  have  been  interested  in  the  application 
of  optimization  techniques  to  models  of  learning  and  instruction. 
Karush  and  Dear  (1966)  developed  an  optimal  strategy  for  teaching 
students  to  learn  a  list  of  independent  items.  The  basic  assumption 
was  that  an  item  is  either  in  a  learned  or  unlearned  state.  If  it  is 
given  in  the  unlearned  state  it  goes  to  the  learned  state  with 
probability  c,  while  once  it  reaches  the  learned  state  it  stays 
there.  Atkinson  and  Paulson  (1972)  described  experiments  in 
which  extensions  of  this  model  were  applied  to  computer-assisted 
spelling instruction with elementary school children. Chant and
Atkinson (1973) developed an optimization technique for allocating
instructional effort to two interrelated blocks or strands of
learning  material.  Their  key  assumption  was  that  the  learning 
rate  for  each  of  the  two  strands  depends  solely  on  the  difference 
between  the  achievement  levels  on  the  two  strands. 

The  model  of  this  report  concerns  a  system  where  a  student 
is  to  be  taught  to  perform  a  certain  skill  at  a  given  level  of 
competence. He achieves this by working through problems or
taking tests at various levels of difficulty. It is assumed that
if a student is able to perform successfully at one level of difficulty,
he is able to perform at the next lower or preceding level of difficulty
and consequently at all lower levels of difficulty. This assumption
is particularly applicable in the following two situations.

The  first  situation  is  one  where  the  material  covered  at  one 
level  includes  all  that  covered  at  preceding  levels,  plus  some 
additional  material.  An  example  of  this  is  a  program  developed 
at Behavioral Technology Laboratories to teach students Kirchhoff's
laws. This course comprises eleven levels, from the lowest
level, which defines the units for voltage, current and resistance, up to
the highest level, which deals with the application of Ohm's law
and Kirchhoff's voltage and current laws in complex networks.

The  second  situation  is  one  where  the  material  and  problems 
covered  at  a  particular  level  are  virtually  the  same  as  at  the 
immediately  preceding  level  except  more  clues  and  hints  are  given 
at the preceding level. A good example of this would be a version
of the Kirchhoff's laws program considered earlier at Behavioral
Technology Laboratories in which problems would be given at the
following levels:

1. Problems are given in steps with cues and knowledge
of results at each step.

2. Problems are given in steps with no cues or knowledge
of results at each step.

3. The student solves problems in steps but he chooses the steps.

4. The student is simply given problems and asked to solve
them.

Note,  however,  the  assumption  given  for  this  model  would  not 
be  applicable  for  the  situation  where  a  given  level  did  not  use  certain 
material  introduced  at  preceding  levels. 

It  is  also  assumed  that  if  a  student  performs  successfully  at 
one  level,  it  will  increase  his  probability  of  being  able  to  perform 
successfully  at  the  next  higher  level.  The  student  completes  the 
course  when  he  performs  successfully  at  the  highest  level.  The  aim 
of  the  model  presented  in  this  paper  is  to  choose  the  levels  at  which 
problems  should  be  assigned  in  the  course  sequence  so  the  expected 
time  required  by  the  student  to  complete  the  course  is  minimized. 

II.  THE  MODEL 

Mathematical  Formulation 

The  problem  of  instructing  the  student  so  that  he  completes  the 
course  in  minimum  time  is  formulated  as  a  Markov  decision  process. 
The set of actions is $1, \ldots, N$, where action $i$ is that of giving the
student a problem at level $i$. The levels are numbered in decreasing
order of difficulty. Thus level 1 is the hardest and level $N$ the
easiest. The state 0 is that in which the student has performed
successfully at level 1. The states in which the student has not
performed successfully at level 1 are characterized by the vectors
$p = (p_1, \ldots, p_N)$, where $p_i$ equals the probability that the
student will correctly do a problem at level $i$. It is assumed
that if a student can do a problem at level $i$, he can also do it
at level $j$ for all $j > i$. Thus, $p_i$ is non-decreasing in $i$.

For each action $i$, let

$q_i$ = P[student can perform at level $i-1$ | student completes
problem at level $i$ correctly and could not perform
at level $i-1$ before].

Thus, if the state is $p$ and we perform action 1, we go to
state 0 with probability $p_1$ and remain in state $p$ with probability
$(1 - p_1)$. If we take action $i > 1$ we go to state $\bar{p}$ with probability $p_i$,
where $\bar{p}_{i-1} = p_{i-1} + q_i (1 - p_{i-1})$ and $\bar{p}_k = p_k$ for $k \neq i-1$, and remain in state
$p$ with probability $(1 - p_i)$.

Equivalently, the components of $\bar{p}$ above may be represented
by the following:

$$\bar{p}_{i-1} = (1 - q_i)\, p_{i-1} + q_i, \qquad \bar{p}_k = p_k, \quad k \neq i-1. \tag{1}$$

Once the system reaches state 0, it remains there. Associated with
each action $i$ is a cost $c_i$, which may be equal to the expected time
it takes to attempt a problem at level $i$. It is desired to choose an
action policy that reaches state 0 at minimum cost.
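
The transition rule (1) can be illustrated with a minimal Python sketch. The function name `apply_success` and the list layout (`p[0]` holds $p_1$; `q[0]` is an unused placeholder because $q_i$ is defined only for $i \ge 2$) are assumptions made for illustration and do not appear in the report. The numbers in the usage lines are taken from the example in Section IV.

```python
def apply_success(p, q, i):
    """Return the state reached after one correct response at level i (1-based).

    Assumed layout: p[i-1] holds p_i and q[i-1] holds q_i (q[0] is unused).
    For i == 1 the process would move to the absorbing state 0; this sketch
    simply returns an unchanged copy in that case.
    """
    new_p = list(p)  # copy so the caller's state is not modified
    if i > 1:
        # Equation (1): only the component for level i-1 changes.
        new_p[i - 2] = (1 - q[i - 1]) * p[i - 2] + q[i - 1]
    return new_p


p = [0.07, 0.23, 0.46]
q = [None, 0.6, 0.8]
after_level_3 = apply_success(p, q, 3)  # raises p_2 from .23 to about .846
after_level_2 = apply_success(p, q, 2)  # raises p_1 from .07 to about .628
```

A failed attempt leaves the state unchanged, so only successes need to be modeled as state updates.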

Some  Solution  Properties 

A policy specifies an action for each state of the system other
than state 0. Let $V(\pi, p)$ be the total expected cost under policy $\pi$
when the system is in state $p$. If $i$ is the action specified for state $p$
by $\pi$, then it follows that

$$V(\pi, p) = c_i + p_i\, V(\pi, \bar{p}) + (1 - p_i)\, V(\pi, p), \tag{2}$$

where $\bar{p}_{i-1} = (1 - q_i)\, p_{i-1} + q_i$ and $\bar{p}_k = p_k$ for $k \neq i-1$.

It is of course desired to find $\pi^*$ so that $V(\pi^*, p) \le V(\pi, p)$
for all $p$ and all $\pi$.

Note that if action $i$ is taken and the student does not complete
his task at level $i$ successfully, the state does not change and action
$i$ will be repeated. Thus any action taken will be repeated until the
student completes a problem correctly, at which time the state changes
and a new action may be taken. In addition, the state resulting from
the first time the student performs correctly at level $i$ is independent
of the number of attempts it takes the student to perform successfully
at that level. Thus $\pi$ and $p$ determine a sequence of correct
responses at each level, though not the number of trials necessary
to obtain these responses. Of course, the sequence must end with
one correct response at level 1. The expected number of trials
necessary for the student to complete a problem successfully at
level $i$ is $1/p_i$ and the expected cost of this is $c_i/p_i$.

Consider the policy that requires performance at level 1 only
for $p$. The cost of this policy would be $c_1/p_1$, which would be less
than that of any policy requiring more than $c_1/(p_1 c_i)$ successful
performances at level $i$, since each successful performance at level $i$
has an expected cost of at least $c_i$. Hence, the set of performance sequences
for $p$ that are superior to testing at level 1 only is finite and there
is an optimal sequence for each $p$. This establishes the existence of
a policy $\pi^*$ satisfying $V(\pi^*, p) \le V(\pi, p)$ for all $p$ and all $\pi$.




We are now ready to prove the following theorems.

Theorem 1: If $p \ge r \ge 0$, then $V(\pi, p) \le V(\pi, r)$ if $\pi$ is an
optimal policy.

Proof: If $\pi$ consists of one correct response (at level 1) for
both $p$ and $r$, then $V(\pi, p) = c_1/p_1$ while $V(\pi, r) = c_1/r_1$ and the
theorem holds. Suppose the theorem holds for all $p$ and $r$ such
that $\pi$ specifies $n$ or fewer total correct responses for $r$.
Consider $p$ and $r$ such that $\pi$ requires $n + 1$ or fewer correct
responses for $r$ and let $i$ be the level at which the first correct
response for state $r$ must take place. Then $V(\pi, p) \le c_i/p_i + V(\pi, \bar{p})$
and $V(\pi, r) = c_i/r_i + V(\pi, \bar{r})$, where from (1)
$\bar{p}_{i-1} = (1 - q_i)\, p_{i-1} + q_i$, $\bar{r}_{i-1} = (1 - q_i)\, r_{i-1} + q_i$,
$\bar{p}_k = p_k$ and $\bar{r}_k = r_k$ for $k \neq i-1$.
Thus $\bar{p} \ge \bar{r}$, $\bar{r}$ requires $n$ or fewer correct responses and
$V(\pi, \bar{p}) \le V(\pi, \bar{r})$. Since also $c_i/p_i \le c_i/r_i$, it follows that
$V(\pi, p) \le V(\pi, r)$, and the theorem follows from induction.


Theorem 2: There is an optimal policy $\pi$ such that if
$a_1, a_2, \ldots, a_n$ is the sequence of the levels of correct responses for
state $p$, then $a_k \ge a_{k+1}$ for all $k$.

Proof: Let $\pi$ be an optimal policy and $a_1, a_2, \ldots, a_n$ be the
sequence specified for $p$. Suppose there is a $k$ such that $a_k < a_{k+1}$.
Let $\bar{p}$ be the state and $c$ the expected cost resulting from the correct
responses to the sequence $a_1, \ldots, a_{k-1}$ for the initial state $p$.
Consider the policy $\hat{\pi}$ which differs from $\pi$ only in that $a_k$ and $a_{k+1}$
are interchanged in the sequence for $p$, and let $i = a_{k+1}$, $j = a_k$.
If $j < i-1$, then $V(\pi, p) = V(\hat{\pi}, p) = c + c_i/\bar{p}_i + c_j/\bar{p}_j + V(\pi, r)$, where from (1)
$r_{i-1} = (1-q_i)\,\bar{p}_{i-1} + q_i$, $r_{j-1} = (1-q_j)\,\bar{p}_{j-1} + q_j$, $r_l = \bar{p}_l$ for all other $l$, and
consequently $V(\hat{\pi}, p) = V(\pi, p)$. Also from (1), if $j = i-1$, then
$V(\pi, p) = c + c_i/\bar{p}_i + c_{i-1}/\bar{p}_{i-1} + V(\pi, r)$, where $r$ is defined above, while
$V(\hat{\pi}, p) = c + c_i/\bar{p}_i + c_{i-1}/[(1-q_i)\,\bar{p}_{i-1} + q_i] + V(\pi, r)$. Thus $V(\hat{\pi}, p) \le V(\pi, p)$
and $\hat{\pi}$ is also optimal. Continuing in this manner, an optimal sequence for
$p$ is eventually obtained in which the members of the sequence are in
non-increasing order. The theorem follows since $p$ is general.

Thus the search for an optimal policy may be confined to those
which yield a sequence of correct responses at levels that are non-increasing
in the sequence.

As noted before, if one elicits one correct response at level $i$,
$p_{i-1}$ is transformed to $(1 - q_i)\, p_{i-1} + q_i$. Applying this transformation
recursively, it follows that if one elicits $k$ correct responses at level
$i$, $p_{i-1}$ is transformed to $(1 - q_i)^k p_{i-1} + (1 - q_i)^{k-1} q_i + \cdots + q_i$, which
sums to $1 - (1 - q_i)^k (1 - p_{i-1})$. Thus, if $\pi$ is a policy of the above
type and specifies $k(i)$ correct responses at level $i$ for $p$, then

$$V(\pi, p) = \sum_{i=1}^{N} k(i)\, c_i / \bar{p}_i, \quad \text{where} \quad \bar{p}_N = p_N, \qquad \bar{p}_i = 1 - (1 - p_i)(1 - q_{i+1})^{k(i+1)} \ \text{for } i < N. \tag{3}$$
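
As an illustration of (3), the following Python sketch evaluates the expected cost of a policy described by the counts $k(i)$. The function name `policy_cost` and the list layout (index 0 for level 1, `q[0]` unused) are assumptions introduced here; the data in the usage lines are those of the example in Section IV, with $c_3 = 1.9$ as used in the final cost computation there.

```python
def policy_cost(p, q, c, k):
    """Expected cost of equation (3).

    Assumed layout: p[i], c[i], k[i] refer to level i+1; q[i] holds q_{i+1}
    and q[0] is an unused placeholder.
    """
    n = len(p)
    total = 0.0
    for i in range(n):
        if i == n - 1:
            p_bar = p[i]  # the easiest level receives no boost
        else:
            # k[i + 1] successes one level below raise the success probability
            p_bar = 1 - (1 - p[i]) * (1 - q[i + 1]) ** k[i + 1]
        total += k[i] * c[i] / p_bar
    return total


p = [0.07, 0.23, 0.46]
q = [None, 0.6, 0.8]
c = [6.7, 4.3, 1.9]
one_each = policy_cost(p, q, c, [1, 1, 1])    # one success required at every level
top_only = policy_cost(p, q, c, [1, 0, 0])    # testing at level 1 only
```

With these data, requiring one success at every level is far cheaper than testing at level 1 only.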
Let $V_n(p)$ be the minimum cost for state $p$ if we restrict
instruction to levels $1, \ldots, n$. That is, no instruction takes place
at levels $n+1, n+2, \ldots, N$. Of course, only the first $n$
components of $p$ are relevant in determining $V_n(p)$, and throughout
the remainder of this paper it will be assumed that $p$ is restricted
to $p_1, \ldots, p_n$ in $V_n(p)$.

In other parts of this paper, the symbol $p$ will be used to
represent restrictions of $p$ to certain components where the
restriction is obvious. In particular, in the expression $V_n(p, p_n)$,
$p$ represents the restriction of $p$ to $p_1, \ldots, p_{n-1}$.

From Theorem 2 and (3) it follows that

$$V_n(p) = \min_k \left[ V_n^k(p) \right], \quad \text{where}$$

$$V_n^k(p) = k\, c_n / p_n + V_{n-1}\!\left(p,\ 1 - (1 - q_n)^k (1 - p_{n-1})\right), \qquad V_1(p) = c_1 / p_1. \tag{4}$$

Note that in the right-hand side of the second line of (4), $p$
represents the restriction of $p$ to $p_1, p_2, \ldots, p_{n-2}$.

Algorithm  for  Two  Levels 

For the two-level problem it follows from (4) that

$$V_2(p) = \min_k V_2^k(p), \quad \text{where} \quad V_2^k(p) = k\, c_2 / p_2 + c_1 / \left(1 - (1 - q_2)^k (1 - p_1)\right). \tag{5}$$

Consequently,

$$V_2^k(p) - V_2^{k-1}(p) = c_2/p_2 + c_1/\left(1 - (1-q_2)^k (1-p_1)\right) - c_1/\left(1 - (1-q_2)^{k-1}(1-p_1)\right). \tag{6}$$

$V_2^k(p) \le V_2^{k-1}(p)$ if and only if the expression in (6) is less than or
equal to zero. This happens if and only if $p_2 \ge f(k, p_1)$, where

$$f(k, p_1) = c_2 \left[ (1-q_2)^k (1-p_1) - (2 - q_2) + \frac{1}{(1-q_2)^{k-1}(1-p_1)} \right] \Big/ \left( c_1 q_2 \right) \tag{7}$$

for $k = 1, 2, \ldots$

Thus, requiring a student to perform successfully $k$ times at
level 2 is preferred to requiring him to perform successfully $k-1$
times at level 2 if $p_2 > f(k, p_1)$, while requiring him to perform
successfully $k-1$ times at level 2 is preferred if $p_2 < f(k, p_1)$, and
these two strategies yield equal costs if $p_2 = f(k, p_1)$.

Theorem 3: In (7), $f(k, p_1)$ is nondecreasing in $k$ and is
non-negative for all $k$.

Proof: Substituting in (7) and rearranging terms one obtains
$f(1, p_1) = c_2 \left[ p_1^2 + q_2\, p_1 (1 - p_1) \right] / \left[ (1 - p_1)\, c_1 q_2 \right]$, which is clearly
non-negative. Also,

$$f(k, p_1) - f(k-1, p_1) = c_2 \left[ \frac{1}{(1-q_2)^{k-1}(1-p_1)} - (1-q_2)^{k-1}(1-p_1) \right] \Big/ c_1,$$

which is non-negative since the positive term of the second factor in
the numerator exceeds 1 while the negative one is less than 1.
Thus, $f(k, p_1)$ is increasing in $k$. Q.E.D.

Theorem 4: Define $f(0, p_1) = 0$ and $f(k, p_1)$ as in (7) for
$k = 1, 2, \ldots, m$, where $m$ is such that $f(m, p_1) \ge 1$ and $f(m-1, p_1) < 1$.
Then the value of $k$ that minimizes (5) is that which satisfies
$f(k, p_1) \le p_2 \le f(k+1, p_1)$.

Proof: It follows immediately from Theorem 3 that
$V_2^k(p) \le V_2^{k-1}(p) \le \cdots \le V_2^0(p)$ and similarly, if $i > k$,
$V_2^i(p) \le V_2^{i+1}(p) \le \cdots \le V_2^m(p)$.

Thus from Theorem 4, for fixed $p_1$, the number of successful
performances required at level 2 to minimize cost is an increasing
step function of $p_2$, starting at 0 for $p_2 = 0$ and advancing in
increments of one. The minimizing cost may be found from (5) once
$k$ is known.
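
The two-level rule of Theorem 4 can be sketched in Python as follows. The function names `f` and `optimal_k_level2` are illustrative choices, not names from the report, and the sketch assumes $p_1 < 1$ and $0 < q_2 < 1$ so that the threshold in (7) is finite and grows without bound in $k$. The numeric values in the usage lines come from the example in Section IV.

```python
def f(k, p1, q2, c1, c2):
    """Threshold of equation (7): k successes at level 2 are at least as
    good as k-1 successes exactly when p2 >= f(k, p1)."""
    a = (1 - q2) ** (k - 1) * (1 - p1)
    return c2 * ((1 - q2) * a - (2 - q2) + 1 / a) / (c1 * q2)


def optimal_k_level2(p1, p2, q2, c1, c2):
    """Number of successes to require at level 2 (Theorem 4).

    Because f is nondecreasing in k (Theorem 3), it is enough to step k
    upward until p2 falls below the next threshold.
    """
    k = 0
    while p2 >= f(k + 1, p1, q2, c1, c2):
        k += 1
    return k


threshold_1 = f(1, 0.07, 0.6, 6.7, 4.3)   # about .05
threshold_2 = f(2, 0.07, 0.6, 6.7, 4.3)   # exceeds 1
k_for_p2 = optimal_k_level2(0.07, 0.23, 0.6, 6.7, 4.3)
```

Since the second threshold exceeds one for these data, no value of $p_2$ ever calls for more than one success at level 2.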

Additional  Solution  Properties 

In calculating $V_n(p)$, it is much more difficult to get a closed
form such as that for $V_1(p)$ and $V_2(p)$. However, as this section
will show, for fixed $p_{n-1}, p_{n-2}, \ldots, p_1$, the value of $k$ that minimizes
(4) is a non-decreasing step function in $p_n$ with a value of 0 at $p_n = 0$.
However, the increments of the step function are not necessarily one.
The next lemma and two theorems show this. First, define

$$f_n(j, k, p) = (k - j)\, c_n \Big/ \left[ V_{n-1}\!\left(p,\ 1 - (1-q_n)^j (1 - p_{n-1})\right) - V_{n-1}\!\left(p,\ 1 - (1-q_n)^k (1 - p_{n-1})\right) \right] \tag{8}$$

for $j < k$, $p = (p_1, \ldots, p_{n-1})$.

Lemma 5: In (8), $f_n(j, k, p) \ge 0$ when defined, and $V_n^k(p) > V_n^j(p)$
for $p_n < f_n(j, k, p)$; $V_n^k(p) < V_n^j(p)$ for $p_n > f_n(j, k, p)$; and equality
holds for $p_n = f_n(j, k, p)$.

Proof: The denominator of $f_n(j, k, p)$ is non-negative by
Theorem 1. The lemma then follows from the definition of
$V_n^k(p)$ in (4).


Theorem 6: The number of successful performances required
at level $n$ in order to realize $V_n(p)$ is non-decreasing in $p_n$.

Proof: For $n = 1$ the theorem obviously holds since only one
successful performance is required for all $p_1$. For $n > 1$ assume
the theorem is false. Then there is a system with $k > j$ and $p_n < p'_n$
such that $V_n^k(p, p_n) < V_n^j(p, p_n)$ and $V_n^j(p, p'_n) < V_n^k(p, p'_n)$. Let
$f_n(j, k, p)$ be as defined in Lemma 5. Then $p'_n < f_n(j, k, p)$ and
$p_n > f_n(j, k, p)$, which contradicts $p_n < p'_n$.


Thus, it has now been shown that for fixed $p_1, p_2, \ldots, p_{n-1}$, the
value of $k$ that minimizes $V_n^k(p)$ is a non-decreasing step function
of $p_n$.

Of course, for $p_n = 1$, the minimizing value of $k$ cannot exceed
$V_{n-1}(p)/c_n$, since any value exceeding this would be inferior to $k = 0$.
Also, $f_n(j, k, p) \ge c_n / \left[ V_{n-1}(p, 1 - (1-q_n)^j (1-p_{n-1})) - V_{n-1}(p, 1) \right]$.
Thus, if $V_{n-1}(p, 1 - (1-q_n)^j (1-p_{n-1})) \le c_n + V_{n-1}(p, 1)$, then $f_n(j, k, p) \ge 1$
and $j$ successful performances is preferred to $k$ successful
performances for all $k > j$. Thus, the sequence of optimal values
of $k$ in (4) is a subsequence of the set $0, 1, 2, \ldots, \min \{ [V_{n-1}(p)/c_n],\ j \}$,
where $j$ is the smallest integer satisfying

$$V_{n-1}\!\left(p,\ 1 - (1-q_n)^j (1-p_{n-1})\right) \le c_n + V_{n-1}(p, 1).$$

Theorem 7: Suppose $j < k < l$, $f_n(j, k, p) \ge f_n(k, l, p)$, and
$p_1, p_2, \ldots, p_{n-1}$ are fixed. Then $V_n^k(p) \neq V_n(p)$ for any $p_n$.

Proof: For $p_n < f_n(j, k, p)$, $V_n^j(p) < V_n^k(p)$, while for $p_n \ge f_n(j, k, p)$,
$p_n \ge f_n(k, l, p)$ and $V_n^l(p) \le V_n^k(p)$.

Theorem 8: Suppose $n(1), n(2), \ldots, n(m)$ is an increasing
sequence of integers such that $f_n(n(i), n(i+1), p)$ is increasing in
$i$ and that $V_n^j(p) \neq V_n(p)$ for any $p_n$ if $j$ is not in this sequence.
Then $V_n^{n(i)}(p) = V_n(p)$ for $f_n(n(i-1), n(i), p) \le p_n \le f_n(n(i), n(i+1), p)$.

Proof: For $p_n \ge f_n(n(i-1), n(i), p)$, $V_n^{n(i)}(p, p_n) \le V_n^{n(i-1)}(p, p_n)
\le \cdots \le V_n^{n(1)}(p, p_n)$. For $p_n \le f_n(n(i), n(i+1), p)$, $V_n^{n(i)}(p, p_n) \le
V_n^{n(i+1)}(p, p_n) \le \cdots \le V_n^{n(m)}(p, p_n)$. Q.E.D.

This means one may start with the sequence $0, 1, \ldots,
\min \{ [V_{n-1}(p)/c_n],\ j \}$, where $j$ is the smallest integer such that
$V_{n-1}(p, 1 - (1-q_n)^j (1-p_{n-1})) \le c_n + V_{n-1}(p, 1)$, eliminate those members
that cannot be optimal for any $p_n$ by Theorem 7, and continue to eliminate
from the remaining sequence until the sequence left satisfies the
conditions of Theorem 8. This procedure must be finite since
only a finite number of eliminations may occur.

General  Algorithm 

Formally, the algorithm for finding the value of $k$ that
minimizes $V_n^k(p)$ as a function of $p_n$ is as follows:

Algorithm 1

1. Set $n = 1$ and $V_n(p) = c_1/p_1$.

2. If $n = N$, terminate. Otherwise increase $n$ by one and
define $n(0), n(1), \ldots, n(m)$ where $n(i) = i$ and
$m = \min \{ [V_{n-1}(p)/c_n],\ j \}$, where $j$ is the smallest
integer satisfying $V_{n-1}(p, 1 - (1-q_n)^j (1-p_{n-1})) \le c_n + V_{n-1}(p, 1)$.

3. Compute $f_n(n(i), n(i+1), p)$ for $i = 0, \ldots, m-1$ according
to the formula in (8). For $n > 2$, $V_{n-1}(p)$ may be
calculated by Algorithm 2.

4. If no $i$ satisfies $f_n(n(i), n(i+1), p) \ge f_n(n(i+1), n(i+2), p)$,
delete from the sequence any $n(i+1)$ such that $f_n(n(i), n(i+1), p) \ge 1$
and return to step 2, as the value of $k$ which minimizes $V_n^k(p)$
is the $n(i)$ which satisfies $f_n(n(i-1), n(i), p) \le p_n \le f_n(n(i), n(i+1), p)$.
Otherwise, for any $i$ satisfying $f_n(n(i), n(i+1), p) \ge f_n(n(i+1), n(i+2), p)$,
delete $n(i+1)$ from the sequence (by Theorem 7 it cannot be optimal), relabel
the members of the remaining sequence $n(0), n(1), \ldots, n(m)$
in increasing order with $m + 1$ equaling the number of
elements in the remaining sequence and return to step 3.
Given the sequences generated by Algorithm 1, the optimal
number of successful performances to require at each level and $V_n(p)$
for $n > 1$ may be found as follows.

Algorithm 2

1. Set $m = n$ and define $\bar{p}_n = p_n$ and $k(n)$ as the $n(j)$ that
satisfies $f_n(n(j-1), n(j), p) \le \bar{p}_n \le f_n(n(j), n(j+1), p)$.

2. Decrease $m$ by one.

3. Define $\bar{p}_m = 1 - (1 - p_m)(1 - q_{m+1})^{k(m+1)}$. Then define
$k(m)$ as the $n(j)$ that satisfies $f_m(n(j-1), n(j), p) \le \bar{p}_m
\le f_m(n(j), n(j+1), p)$.

4. If $m = 2$, define $\bar{p}_1 = 1 - (1 - p_1)(1 - q_2)^{k(2)}$ and go to 5.
Otherwise go to 2.

5. Terminate. $V_n(p) = \sum_{i=1}^{n} k(i)\, c_i / \bar{p}_i$, where $k(1) = 1$.
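
Algorithms 1 and 2 locate the minimizing $k$ through the thresholds $f_n$. As a cross-check, recursion (4) can also be evaluated directly by exhaustive search. The Python sketch below does this; it is a deliberately simpler brute-force substitute, not the threshold algorithm of the report. The name `min_cost`, the cap `k_max` on the number of successes tried at each level, and the list layout (index 0 for level 1, `q[0]` unused) are assumptions made here.

```python
def min_cost(p, q, c, k_max=25):
    """Brute-force evaluation of recursion (4).

    p is a list holding p_1..p_n; q[i] holds q_{i+1} (q[0] unused); c[i]
    holds c_{i+1}. Returns (cost, ks), where ks[i] is the number of
    successful performances required at level i+1.
    """
    n = len(p)
    if n == 1:
        return c[0] / p[0], [1]  # V_1(p) = c_1 / p_1
    best_cost, best_ks = float("inf"), None
    for k in range(k_max + 1):
        # k successes at level n raise p_{n-1} as in equation (3)
        boosted = 1 - (1 - q[n - 1]) ** k * (1 - p[n - 2])
        sub_cost, sub_ks = min_cost(p[:n - 2] + [boosted], q, c, k_max)
        cost = k * c[n - 1] / p[n - 1] + sub_cost
        if cost < best_cost:
            best_cost, best_ks = cost, sub_ks + [k]
    return best_cost, best_ks


cost, ks = min_cost([0.07, 0.23, 0.46], [None, 0.6, 0.8], [6.7, 4.3, 1.9])
```

The search tries `(k_max + 1)` values at every level, so its running time grows exponentially with the number of levels; it is intended only for checking small cases such as the example in Section IV.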

III.  ESTIMATION  OF  PARAMETERS 
Maximum  Likelihood 

The past performances of students may be used to obtain a maximum
likelihood estimate of the $p$ and $q$ input vectors.

Let $a_{ki}$ and $b_{ki}$ respectively be the number of incorrect and
correct responses of students at level $i$ who had given $k$ correct
responses at level $i+1$, and define $L(p, q)$ as the likelihood
function of the vector $(p, q)$. It follows from (3) that

$$L(p, q) = \prod_{i=1}^{n} \prod_{k=0}^{m} \left[ (1-p_i)(1-q_{i+1})^k \right]^{a_{ki}} \left[ 1 - (1-p_i)(1-q_{i+1})^k \right]^{b_{ki}}. \tag{9}$$

Taking the partial derivatives of $L(p, q)$ with respect to all $p_i$ and
$q_{i+1}$ one obtains (10) and (11); setting them to zero yields (12) and (13).

$$\frac{\partial L(p, q)}{\partial p_i} = L(p, q) \sum_{k=0}^{m} \left[ \frac{-a_{ki}}{1-p_i} + \frac{b_{ki}\,(1-q_{i+1})^k}{1 - (1-p_i)(1-q_{i+1})^k} \right] \tag{10}$$

$$\frac{\partial L(p, q)}{\partial q_{i+1}} = L(p, q) \sum_{k=0}^{m} \left[ \frac{-k\, a_{ki}}{1-q_{i+1}} + \frac{k\, b_{ki}\,(1-p_i)(1-q_{i+1})^{k-1}}{1 - (1-p_i)(1-q_{i+1})^k} \right] \tag{11}$$

$$\sum_{k=0}^{m} a_{ki} = \sum_{k=0}^{m} \frac{b_{ki}\,(1-p_i)(1-q_{i+1})^k}{1 - (1-p_i)(1-q_{i+1})^k} \tag{12}$$

$$\sum_{k=0}^{m} k\, a_{ki} = \sum_{k=0}^{m} \frac{k\, b_{ki}\,(1-p_i)(1-q_{i+1})^k}{1 - (1-p_i)(1-q_{i+1})^k} \tag{13}$$

Note that in (12) and (13) the only unknown that parameter $p_i$
depends on is $q_{i+1}$ and vice versa. Thus the $p$ and $q$ vectors may
be estimated by solving sets of two simultaneous equations in two
unknowns. However, there is no analytical way of solving these
equations in general for $p_i$ and $q_{i+1}$. Nevertheless, suppose
one takes into account only the two values of $k$ with the highest number
of observations. Denote these values by $r$ and $s$.

Then from (12) and (13) one obtains

$$a_{ri} + a_{si} = \frac{b_{ri}\,(1-p_i)(1-q_{i+1})^r}{1 - (1-p_i)(1-q_{i+1})^r} + \frac{b_{si}\,(1-p_i)(1-q_{i+1})^s}{1 - (1-p_i)(1-q_{i+1})^s} \tag{14}$$

$$r\, a_{ri} + s\, a_{si} = \frac{r\, b_{ri}\,(1-p_i)(1-q_{i+1})^r}{1 - (1-p_i)(1-q_{i+1})^r} + \frac{s\, b_{si}\,(1-p_i)(1-q_{i+1})^s}{1 - (1-p_i)(1-q_{i+1})^s} \tag{15}$$

Subtracting $r$ times equation (14) from equation (15) yields

$$(s-r)\, a_{si} = \frac{(s-r)\, b_{si}\,(1-p_i)(1-q_{i+1})^s}{1 - (1-p_i)(1-q_{i+1})^s} \tag{16}$$

which yields

$$a_{si} = \frac{b_{si}\,(1-p_i)(1-q_{i+1})^s}{1 - (1-p_i)(1-q_{i+1})^s} \tag{17}$$

$$a_{ri} = \frac{b_{ri}\,(1-p_i)(1-q_{i+1})^r}{1 - (1-p_i)(1-q_{i+1})^r} \tag{18}$$

From (17) and (18) one obtains

$$(1-p_i)(1-q_{i+1})^r = \frac{a_{ri}}{a_{ri} + b_{ri}} = X_r \tag{19}$$

$$(1-p_i)(1-q_{i+1})^s = \frac{a_{si}}{a_{si} + b_{si}} = X_s \tag{20}$$

where $X_r$ and $X_s$ are the proportion of unsuccessful performances
at level $i$ given $r$ and $s$ successful performances at level $i+1$
respectively. Solving these two equations for estimates of $p_i$ and
$q_{i+1}$ one obtains

$$p_i = 1 - X_r^{\,s/(s-r)}\, X_s^{\,-r/(s-r)} \tag{21}$$

$$q_{i+1} = 1 - (X_s / X_r)^{1/(s-r)} \tag{22}$$

Confidence Intervals

Consider two Bernoulli random variables with parameters
$(1-p_i)(1-q_{i+1})^r$ and $(1-p_i)(1-q_{i+1})^s$ respectively.
From (19) and (20) it follows that $X_r$ and $X_s$ are the sample
means of such random variables. Thus, it follows that for large sample
sizes, a $100(1-\rho)$ percent confidence interval for $(1-p_i)(1-q_{i+1})^r$ is given
by:


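$$L_{1r} = X_r - Z \sqrt{X_r (1 - X_r) / (a_{ri} + b_{ri})} \tag{23}$$

$$L_{2r} = X_r + Z \sqrt{X_r (1 - X_r) / (a_{ri} + b_{ri})} \tag{24}$$

and for $(1-p_i)(1-q_{i+1})^s$ by

$$L_{1s} = X_s - Z \sqrt{X_s (1 - X_s) / (a_{si} + b_{si})} \tag{25}$$

$$L_{2s} = X_s + Z \sqrt{X_s (1 - X_s) / (a_{si} + b_{si})} \tag{26}$$

where $Z = Z_{\rho/2}$; that is, $P[Z \le Z_{\rho/2}] = 1 - \rho/2$, where $Z$
is standard normal.*

Since the probabilities that $(L_{1r}, L_{2r})$ and $(L_{1s}, L_{2s})$ bracket
$(1-p_i)(1-q_{i+1})^r$ and $(1-p_i)(1-q_{i+1})^s$ are each $1-\rho$ and independent, the
probability that both of these bounds hold is $1 - 2\rho + \rho^2 \ge 1 - 2\rho$.

*Larson (1966).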


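From this it follows that $L_{1s}/L_{2r}$ and $L_{2s}/L_{1r}$ form a $100(1-2\rho)$ percent
confidence interval for $(1-q_{i+1})^{s-r}$. Similarly it follows that $L_{1r}^s / L_{2s}^r$
and $L_{2r}^s / L_{1s}^r$ form a $100(1-2\rho)$ percent confidence interval for $(1-p_i)^{s-r}$.
These sets of bounds taken to the $1/(s-r)$ power give $100(1-2\rho)$ percent
confidence intervals for $1 - q_{i+1}$ and $1 - p_i$ respectively. Consequently
a $100(1-\alpha)$ percent confidence interval for $p_i$ is given by

$$P_1 = 1 - \left( L_{2r}^s / L_{1s}^r \right)^{1/(s-r)} \tag{27}$$

$$P_2 = 1 - \left( L_{1r}^s / L_{2s}^r \right)^{1/(s-r)} \tag{28}$$

and for $q_{i+1}$ by

$$Q_1 = 1 - \left( L_{2s} / L_{1r} \right)^{1/(s-r)} \tag{29}$$

$$Q_2 = 1 - \left( L_{1s} / L_{2r} \right)^{1/(s-r)} \tag{30}$$

where $L_{1s}$, $L_{2s}$, $L_{1r}$, $L_{2r}$ are as defined in (23-26) and $\rho = \alpha/2$.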


Note  that  the  above  simplifies  if  s-r  =  1  and  that  this  is  likely 
to  happen  as  the  two  most  likely  choices  for  the  number  of  successful 

performances  to  require  at  a  given  level  are  likely  to  be  consecutive 
integers. 
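
The point estimates (21) and (22) and the interval bounds (27) through (30) can be computed as in the following Python sketch. Several things here are assumptions rather than statements of the report: the function name `estimate_with_intervals`, the use of the sample sizes $a_{ri} + b_{ri}$ and $a_{si} + b_{si}$ under the square roots, and the choice $\rho = \alpha/2$. The counts in the usage line are invented for illustration. `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist


def estimate_with_intervals(a_r, b_r, a_s, b_s, r, s, alpha=0.05):
    """Estimate p_i and q_{i+1} from counts at two values r < s of k.

    a_* are counts of incorrect responses, b_* counts of correct responses.
    Returns (p_hat, q_hat, (P1, P2), (Q1, Q2)).
    """
    n_r, n_s = a_r + b_r, a_s + b_s
    x_r, x_s = a_r / n_r, a_s / n_s        # proportions X_r and X_s
    e = 1 / (s - r)
    p_hat = 1 - x_r ** (s * e) * x_s ** (-r * e)
    q_hat = 1 - (x_s / x_r) ** e
    rho = alpha / 2                         # assumed split of alpha
    z = NormalDist().inv_cdf(1 - rho / 2)   # P[Z <= z] = 1 - rho/2
    h_r = z * sqrt(x_r * (1 - x_r) / n_r)
    h_s = z * sqrt(x_s * (1 - x_s) / n_s)
    l1r, l2r = x_r - h_r, x_r + h_r
    l1s, l2s = x_s - h_s, x_s + h_s
    p_int = (1 - (l2r ** s / l1s ** r) ** e, 1 - (l1r ** s / l2s ** r) ** e)
    q_int = (1 - (l2s / l1r) ** e, 1 - (l1s / l2r) ** e)
    return p_hat, q_hat, p_int, q_int


# Illustrative counts: 1000 students each at k = 0 and k = 1.
p_hat, q_hat, p_int, q_int = estimate_with_intervals(770, 230, 154, 846, r=0, s=1)
```

The sketch assumes the lower bounds `l1r` and `l1s` are positive; with very small samples the normal approximation can push them below zero, in which case the fractional powers are not meaningful.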
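A Linear Estimation Model

In this model the student is given a questionnaire to
determine his level of competence. Each question is scored one
or zero, depending on whether the student answers the question
right or wrong. The $p_i$'s and $q_i$'s for a student are each
assumed to be linear combinations of the scores he receives
on the questions. Thus,

$$p_i = \sum_j u_{ij}\, w_j \tag{31}$$

$$q_i = \sum_j v_{ij}\, w_j \tag{32}$$

where $w_j$ is the score on question $j$ for $j \ge 1$ and $w_0 = 1$.
This yields, for the vector pair $(u, v)$, the likelihood function

$$L(u, v) = \prod_i \left[ \prod_{\ell \in X_i} \Big(1 - \sum_j u_{ij} w_{j\ell}\Big) \Big(1 - \sum_j v_{i+1,j} w_{j\ell}\Big)^{k} \prod_{\ell \in \bar{X}_i} \Big( 1 - \Big(1 - \sum_j u_{ij} w_{j\ell}\Big) \Big(1 - \sum_j v_{i+1,j} w_{j\ell}\Big)^{k} \Big) \right] \tag{33}$$

where $w_{j\ell}$ is the score of student or trial $\ell$ on question $j$ and

$X_i = \{ \ell : \text{student responds incorrectly at level } i \}$
$\bar{X}_i = \{ \ell : \text{student responds correctly at level } i \}$

Note that in (33) $k$ is actually a function of $\ell$.

Taking partial derivatives of $L(u, v)$ one obtains:

$$\frac{\partial L(u, v)}{\partial u_{ij}} = L(u, v) \left[ \sum_{\ell \in X_i} \frac{-w_{j\ell}}{1 - \sum_{j'} u_{ij'} w_{j'\ell}} + \sum_{\ell \in \bar{X}_i} \frac{w_{j\ell} \big(1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}\big)^{k}}{1 - \big(1 - \sum_{j'} u_{ij'} w_{j'\ell}\big)\big(1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}\big)^{k}} \right] \tag{34}$$

$$\frac{\partial L(u, v)}{\partial v_{i+1,j}} = L(u, v) \left[ \sum_{\ell \in X_i} \frac{-k\, w_{j\ell}}{1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}} + \sum_{\ell \in \bar{X}_i} \frac{k\, w_{j\ell} \big(1 - \sum_{j'} u_{ij'} w_{j'\ell}\big)\big(1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}\big)^{k-1}}{1 - \big(1 - \sum_{j'} u_{ij'} w_{j'\ell}\big)\big(1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}\big)^{k}} \right] \tag{35}$$

Setting both derivatives equal to zero, one obtains:

$$\sum_{\ell \in X_i} \frac{w_{j\ell}}{1 - \sum_{j'} u_{ij'} w_{j'\ell}} = \sum_{\ell \in \bar{X}_i} \frac{w_{j\ell} \big(1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}\big)^{k}}{1 - \big(1 - \sum_{j'} u_{ij'} w_{j'\ell}\big)\big(1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}\big)^{k}} \tag{36}$$

$$\sum_{\ell \in X_i} \frac{k\, w_{j\ell}}{1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}} = \sum_{\ell \in \bar{X}_i} \frac{k\, w_{j\ell} \big(1 - \sum_{j'} u_{ij'} w_{j'\ell}\big)\big(1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}\big)^{k-1}}{1 - \big(1 - \sum_{j'} u_{ij'} w_{j'\ell}\big)\big(1 - \sum_{j'} v_{i+1,j'} w_{j'\ell}\big)^{k}} \tag{37}$$

Since different values of $\ell$ lead to different denominators in the
terms of (36) and (37), some simplification of these expressions is needed.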
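Let:

$w_{ijk}$ = number of one scores for question $j$ resulting in
successful performances at level $i$ following $k$
successful performances at level $i+1$.

$\bar{w}_{ijk}$ = number of zero scores for question $j$ resulting in
successful performances at level $i$ following $k$
successful performances at level $i+1$.

$w'_{ijk}$ = number of one scores for question $j$ resulting in
unsuccessful performances at level $i$ following $k$
successful performances at level $i+1$.

$\bar{w}'_{ijk}$ = number of zero scores for question $j$ resulting in
unsuccessful performances at level $i$ following $k$
successful performances at level $i+1$.

Suppose further that only question $j$ is to be considered on the
questionnaire. Then (36) and (37), when applied to $u_{ij}$, $v_{i+1,j}$, $u_{i0}$, $v_{i+1,0}$,
become:

$$\sum_{k=0}^{m} \frac{w'_{ijk}}{1 - u_{ij} - u_{i0}} = \sum_{k=0}^{m} \frac{w_{ijk}\,(1 - v_{i+1,j} - v_{i+1,0})^{k}}{1 - (1 - u_{ij} - u_{i0})(1 - v_{i+1,j} - v_{i+1,0})^{k}} \tag{38}$$

$$\sum_{k=0}^{m} \frac{k\, w'_{ijk}}{1 - v_{i+1,j} - v_{i+1,0}} = \sum_{k=0}^{m} \frac{k\, w_{ijk}\,(1 - u_{ij} - u_{i0})(1 - v_{i+1,j} - v_{i+1,0})^{k-1}}{1 - (1 - u_{ij} - u_{i0})(1 - v_{i+1,j} - v_{i+1,0})^{k}} \tag{39}$$

$$\sum_{k=0}^{m} \frac{w'_{ijk}}{1 - u_{ij} - u_{i0}} + \sum_{k=0}^{m} \frac{\bar{w}'_{ijk}}{1 - u_{i0}} = \sum_{k=0}^{m} \frac{w_{ijk}\,(1 - v_{i+1,j} - v_{i+1,0})^{k}}{1 - (1 - u_{ij} - u_{i0})(1 - v_{i+1,j} - v_{i+1,0})^{k}} + \sum_{k=0}^{m} \frac{\bar{w}_{ijk}\,(1 - v_{i+1,0})^{k}}{1 - (1 - u_{i0})(1 - v_{i+1,0})^{k}} \tag{40}$$

$$\sum_{k=0}^{m} \frac{k\, w'_{ijk}}{1 - v_{i+1,j} - v_{i+1,0}} + \sum_{k=0}^{m} \frac{k\, \bar{w}'_{ijk}}{1 - v_{i+1,0}} = \sum_{k=0}^{m} \frac{k\, w_{ijk}\,(1 - u_{ij} - u_{i0})(1 - v_{i+1,j} - v_{i+1,0})^{k-1}}{1 - (1 - u_{ij} - u_{i0})(1 - v_{i+1,j} - v_{i+1,0})^{k}} + \sum_{k=0}^{m} \frac{k\, \bar{w}_{ijk}\,(1 - u_{i0})(1 - v_{i+1,0})^{k-1}}{1 - (1 - u_{i0})(1 - v_{i+1,0})^{k}} \tag{41}$$

However, when (38) and (39) are multiplied by $1 - u_{ij} - u_{i0}$ and
$1 - v_{i+1,j} - v_{i+1,0}$ they become (12) and (13) with $w'_{ijk}$, $w_{ijk}$, $u_{ij} + u_{i0}$, $v_{i+1,j} + v_{i+1,0}$
replacing $a_{ki}$, $b_{ki}$, $p_i$, and $q_{i+1}$ respectively.

Thus if one takes into account observations corresponding only
to the two values of $k$ with the highest number of observations, $r$ and $s$, one obtains from (19-22)

$$u_{ij} + u_{i0} = 1 - \left( \frac{w'_{ijr}}{w_{ijr} + w'_{ijr}} \right)^{\frac{s}{s-r}} \left( \frac{w'_{ijs}}{w_{ijs} + w'_{ijs}} \right)^{\frac{-r}{s-r}} \tag{42}$$

$$v_{i+1,j} + v_{i+1,0} = 1 - \left[ \frac{w'_{ijs}\,(w_{ijr} + w'_{ijr})}{w'_{ijr}\,(w_{ijs} + w'_{ijs})} \right]^{\frac{1}{s-r}} \tag{43}$$

Subtracting (38) and (39) from (40) and (41) and then multiplying
through by $1 - u_{i0}$ and $1 - v_{i+1,0}$ yield (12) and (13) with $\bar{w}'_{ijk}$, $\bar{w}_{ijk}$, $u_{i0}$, and
$v_{i+1,0}$ replacing $a_{ki}$, $b_{ki}$, $p_i$, and $q_{i+1}$ respectively. Thus, after
considering only $r$ and $s$ as values of $k$, one obtains from (19-22)
and (42) and (43)

$$u_{i0} = 1 - \left( \frac{\bar{w}'_{ijr}}{\bar{w}_{ijr} + \bar{w}'_{ijr}} \right)^{\frac{s}{s-r}} \left( \frac{\bar{w}'_{ijs}}{\bar{w}_{ijs} + \bar{w}'_{ijs}} \right)^{\frac{-r}{s-r}} \tag{44}$$

$$v_{i+1,0} = 1 - \left[ \frac{\bar{w}'_{ijs}\,(\bar{w}'_{ijr} + \bar{w}_{ijr})}{\bar{w}'_{ijr}\,(\bar{w}'_{ijs} + \bar{w}_{ijs})} \right]^{\frac{1}{s-r}} \tag{45}$$

$$u_{ij} = \left( \frac{\bar{w}'_{ijr}}{\bar{w}_{ijr} + \bar{w}'_{ijr}} \right)^{\frac{s}{s-r}} \left( \frac{\bar{w}'_{ijs}}{\bar{w}_{ijs} + \bar{w}'_{ijs}} \right)^{\frac{-r}{s-r}} - \left( \frac{w'_{ijr}}{w_{ijr} + w'_{ijr}} \right)^{\frac{s}{s-r}} \left( \frac{w'_{ijs}}{w_{ijs} + w'_{ijs}} \right)^{\frac{-r}{s-r}} \tag{46}$$

$$v_{i+1,j} = \left[ \frac{\bar{w}'_{ijs}\,(\bar{w}'_{ijr} + \bar{w}_{ijr})}{\bar{w}'_{ijr}\,(\bar{w}'_{ijs} + \bar{w}_{ijs})} \right]^{\frac{1}{s-r}} - \left[ \frac{w'_{ijs}\,(w_{ijr} + w'_{ijr})}{w'_{ijr}\,(w_{ijs} + w'_{ijs})} \right]^{\frac{1}{s-r}} \tag{47}$$

In a similar manner one could also obtain $100(1-\alpha)$ percent
confidence intervals for $u_{ij}$, $v_{i+1,j}$, $u_{i0}$, and $v_{i+1,0}$ by noting that
$100(1-\alpha)$ percent confidence intervals for $u_{ij} + u_{i0}$ and $v_{i+1,j} + v_{i+1,0}$
are also $100(1-\alpha)$ percent confidence intervals for $u_{ij}$ and $v_{i+1,j}$ and
making the appropriate substitutions in (23-30). Note that the length
of these confidence intervals tends to zero as the sample sizes for
both $k = r$ and $k = s$ tend to infinity and consequently the estimates in
(44-47) are consistent.

In (44-47) the weights are based on the assumption that all
weights except for question $j$ are zero. For the more general case
where this is not assumed, it is suggested that the weights used be the
average of the results in (44-47) taken over all $j$. That is:

$$u_{i0} = 1 - \frac{1}{n} \sum_j \left( \frac{\bar{w}'_{ijr}}{\bar{w}_{ijr} + \bar{w}'_{ijr}} \right)^{\frac{s}{s-r}} \left( \frac{\bar{w}'_{ijs}}{\bar{w}_{ijs} + \bar{w}'_{ijs}} \right)^{\frac{-r}{s-r}} \tag{48}$$

$$v_{i+1,0} = 1 - \frac{1}{n} \sum_j \left[ \frac{\bar{w}'_{ijs}\,(\bar{w}'_{ijr} + \bar{w}_{ijr})}{\bar{w}'_{ijr}\,(\bar{w}'_{ijs} + \bar{w}_{ijs})} \right]^{\frac{1}{s-r}} \tag{49}$$

$$u_{ij} = \frac{1}{n} \left[ \left( \frac{\bar{w}'_{ijr}}{\bar{w}_{ijr} + \bar{w}'_{ijr}} \right)^{\frac{s}{s-r}} \left( \frac{\bar{w}'_{ijs}}{\bar{w}_{ijs} + \bar{w}'_{ijs}} \right)^{\frac{-r}{s-r}} - \left( \frac{w'_{ijr}}{w_{ijr} + w'_{ijr}} \right)^{\frac{s}{s-r}} \left( \frac{w'_{ijs}}{w_{ijs} + w'_{ijs}} \right)^{\frac{-r}{s-r}} \right], \quad j \neq 0 \tag{50}$$

$$v_{i+1,j} = \frac{1}{n} \left\{ \left[ \frac{\bar{w}'_{ijs}\,(\bar{w}'_{ijr} + \bar{w}_{ijr})}{\bar{w}'_{ijr}\,(\bar{w}'_{ijs} + \bar{w}_{ijs})} \right]^{\frac{1}{s-r}} - \left[ \frac{w'_{ijs}\,(w_{ijr} + w'_{ijr})}{w'_{ijr}\,(w_{ijs} + w'_{ijs})} \right]^{\frac{1}{s-r}} \right\}, \quad j \neq 0 \tag{51}$$

where $n$ is the number of questions on the questionnaire.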
IV.  EXAMPLE 

Consider the three-level problem with the following data.

$(c_1, c_2, c_3) = (6.7, 4.3, 1.9)$

$(p_1, p_2, p_3) = (.07, .23, .46)$

$(q_2, q_3) = (.6, .8)$

It then follows that $V_1(p) = 6.7/p_1$.

For level 2, substitute the appropriate values for $c_1$, $c_2$, $p_1$, $q_2$ in
(7) to obtain $f(k, p_1) = .995(.4)^k - 1.498 + 1.15/(.4)^{k-1}$.
This yields $f(1, p_1) = .050$, $f(2, p_1) = 1.536$. Since the latter exceeds
one, no more than one successful performance at level 2 will ever be
required. This yields the following table for $V_2(p)$.

i    n(i)    f
0    0       .000
1    1       .050

Table 1: Output for level 2

This table may be read as follows. For $0 \le p_2 < .050$, require no
successful performances at level 2. For $p_2 > .050$, require one
successful performance at level 2.

For level 3, note from (8) that

$$f_3(k-1, k, p) = 1.9 \Big/ \left[ V_2\!\left(p,\ 1 - .77(.2)^{k-1}\right) - V_2\!\left(p,\ 1 - .77(.2)^{k}\right) \right].$$

In order to find $f_3(0, 1, p)$, $V_2(p, 1 - .77(.2)^0) = V_2(p, .23)$ and
$V_2(p, 1 - .77(.2)) = V_2(p, .846)$ must be calculated by Algorithm 2. For the
calculation of $V_2(p, .23)$, $k(2) = 1$ by Table 1. Thus $\bar{p}_1 = 1 - (.4)(.93) = .628$
and $V_2(p, .23) = 4.3/.23 + 6.7/.628 = 29.37$. Similarly, $V_2(p, .846) = 15.75$.
Thus $f_3(0, 1, p) = 1.9/(29.37 - 15.75) = .140$. Similarly, $V_2(p, 1 - .77(.2)^2) =
V_2(p, .969) = 15.11$ and $f_3(1, 2, p) = 2.97$. For $k \ge 3$, $V_2(p, 1 - .77(.2)^{k-1}) \le
V_2(p, .969) = 15.11$ and $V_2(p, 1 - .77(.2)^k) \ge V_2(p, 1) = 14.97$. Thus
$f_3(k-1, k, p) \ge 1.9/(15.11 - 14.97) = 13.57 > 1$. Thus no more than one
successful performance at level 3 can be required, yielding the
following table.

i    n(i)    f
0    0       .000
1    1       .140

Table 2: Output for level 3

Tables 1 and 2 contain the information necessary to use Algorithm 2
to calculate $V(.07, .23, .46)$. From Table 2, $k(3) = 1$ since $.140 < .46 < 1$.
Thus $\bar{p}_2 = 1 - .77(.2) = .846$. From Table 1, $k(2) = 1$ since $.050 < .846 < 1$
and $\bar{p}_1 = 1 - .93(.4) = .628$. Thus one successful performance is required
at each level and the expected time to complete the course is
$V(.07, .23, .46) = 6.7/.628 + 4.3/.846 + 1.9/.46 = 19.88$.
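
The arithmetic of this example can be reproduced with a few lines of Python. This is only a check of the numbers above under the data of this section; the helper names `v2` and `f3` are illustrative, and `v2` fixes one success at level 2 because every value of $p_2$ that arises here lies above the Table 1 threshold.

```python
c1, c2, c3 = 6.7, 4.3, 1.9
p1, p2, p3 = 0.07, 0.23, 0.46
q2, q3 = 0.6, 0.8


def v2(p2_value, k2=1):
    """Two-level cost from (5) with k2 successes required at level 2."""
    p1_bar = 1 - (1 - q2) ** k2 * (1 - p1)
    return k2 * c2 / p2_value + c1 / p1_bar


def f3(j, k):
    """Threshold (8) for level 3, comparing j and k successes (j < k)."""
    def boosted(m):
        # p_2 after m successes at level 3
        return 1 - (1 - q3) ** m * (1 - p2)
    return (k - j) * c3 / (v2(boosted(j)) - v2(boosted(k)))


threshold_01 = f3(0, 1)   # about .14, below p3 = .46
threshold_12 = f3(1, 2)   # well above 1
total = c1 / (1 - (1 - q2) * (1 - p1)) + c2 / (1 - (1 - q3) * (1 - p2)) + c3 / p3
```

Computed without intermediate rounding, the second threshold comes out slightly below the hand-rounded value in the text, but it still exceeds one, so the conclusion is unchanged.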
List of Symbols

$p = (p_1, \ldots, p_N)$    state vector, where $p_i$ is the probability
                            the student can perform at level $i$.

$q_i$                       probability student can perform at level
                            $i-1$ given that he performs successfully
                            at level $i$ and could not previously
                            perform successfully at level $i-1$.

$V(\pi, p)$                 expected cost under policy $\pi$ when the
                            system is in state $p$.

$V_n(p)$                    minimum cost for state $p$ if we restrict
                            instruction to levels $1, \ldots, n$.

$V_n^k(p)$                  same as $V_n(p)$ except that exactly $k$
                            successful performances at level $n$ are
                            required.

$f(k, p_1)$                 the value of $p_2$ such that $V_2^k(p) = V_2^{k-1}(p)$
                            for fixed $p_1$.

$f_n(j, k, p)$              the value of $p_n$ such that $V_n^j(p) = V_n^k(p)$, $j < k$.

$a_{ki}$                    number of incorrect responses at level $i$
                            following $k$ correct responses at level $i+1$.

$b_{ki}$                    number of correct responses at level $i$
                            following $k$ correct responses at level $i+1$.

$L(p, q)$                   the likelihood function of the vector $(p, q)$.

$w_j$                       score on question $j$.

$u_{ij}$                    weight of question $j$ upon $p_i$.

$v_{ij}$                    weight of question $j$ upon $q_i$.

$w_{ijk}$, $w'_{ijk}$ and their zero-score counterparts    as defined in text.